Abstract

An NLO QCD analysis of the ZEUS data on $e^+p$ deep inelastic scattering, together with
fixed-target data, has been performed, from which the gluon and quark densities of the
proton and the value of the strong coupling parameter, $\alpha_s(M_Z^2)$, have been
extracted. The study includes a full treatment of the experimental systematic
uncertainties, including point-to-point correlations. Different ways of incorporating
correlated systematic uncertainties into the fit are discussed and compared.

1. INTRODUCTION 

Studies of inclusive differential cross sections and structure functions, as measured in deep inelastic
scattering (DIS) of leptons from hadron targets, have played a crucial role in establishing the theory of
perturbative quantum chromodynamics (pQCD). Measurement of the structure functions as a function
of $x$ and $Q^2$ yields information on the shape of the parton distribution functions (PDFs) and, through
their $Q^2$ dependence, on the value of the strong coupling constant $\alpha_s(M_Z^2)$. Most analyses use the
formalism of the next-to-leading-order (NLO) DGLAP evolution equations [1], which provide a successful
description of the data over a broad kinematic range.

In recent years the uncertainties on PDFs from experimental sources, as well as from model
assumptions, have become an issue. The subject of this paper is an evaluation of the experimental
uncertainties on the extracted PDFs and on the value of $\alpha_s(M_Z^2)$. Various methods of treating correlated
systematic uncertainties are discussed. The method selected for the main analysis is conservative, reflecting
knowledge that such systematic uncertainties are not always Gaussian distributed. Model uncertainties
have also been estimated.



2. Description of NLOQCD fit 

Full details of the analysis are given in [2]; here we give only a summary. The kinematics of
lepton-hadron scattering is described in terms of the variables $Q^2$, the negative invariant mass squared of the
exchanged vector boson; Bjorken $x$, the fraction of the momentum of the incoming nucleon taken by the
struck quark (in the quark-parton model); and $y$, which measures the energy transfer between the lepton
and hadron systems. The differential cross-section for the process $e^+p \to e^+X$ is given in terms of the
structure functions by

$$\frac{d^2\sigma(e^+p)}{dx\,dQ^2} = \frac{2\pi\alpha^2}{Q^4 x}\left[ Y_+ F_2(x,Q^2) - y^2 F_L(x,Q^2) - Y_- xF_3(x,Q^2) \right]$$
where $Y_\pm = 1 \pm (1-y)^2$. The structure functions are directly related to PDFs, and their $Q^2$ dependence,
or scaling violation, is predicted by pQCD. For $Q^2 < 1000$ GeV$^2$, $F_2$ dominates the charged
lepton-hadron cross-section and for $x < 10^{-2}$ the gluon contribution dominates the $Q^2$ evolution of $F_2$, such
that HERA data in this kinematic region provide crucial information on quark and gluon distributions.
(Schematically, $F_2 \sim xq$ and $dF_2/d\ln Q^2 \sim \alpha_s P_{qg}\,xg$.)

A global fit of ZEUS [3] and fixed-target DIS data [4] has been performed. The fixed-target data are
used to provide information on the valence quark distributions and the flavour composition of the sea,
and to constrain the fits at high $x$. All data sets used have full information on point-to-point correlated
systematic uncertainties (ZEUS $e^+p$ cross-section data; NMC, E665 and BCDMS $\mu$-$p$ and $\mu$-$D$ $F_2$
data; CCFR $\nu$, $\bar{\nu}$ $xF_3$ data on an Fe target). A total of 71 sources of systematic uncertainty, including
normalisation uncertainties, were included.

The analysis is performed within the conventional framework of leading-twist NLO QCD, with the
renormalisation and factorisation scales chosen to be $Q^2$. In the standard fit the following cuts are made
on the ZEUS and the fixed-target data: (i) $W^2 > 20$ GeV$^2$, to reduce the sensitivity to target mass and
higher twist contributions, which become important at high $x$ and low $Q^2$; (ii) $Q^2 > 2.5$ GeV$^2$, to remain
in the kinematic region where perturbative QCD should be applicable. The heavy quark production
scheme is the general mass variable flavour number scheme of Thorne and Roberts [5].

The DGLAP equations yield the PDFs at all values of $Q^2$, provided they are input as functions of $x$
at some input scale $Q_0^2$. The PDFs for u valence, d valence, total sea ($xS$), gluon ($xg$) and the difference
between the d and u contributions to the sea, are each parametrized by the form

$$xf(x) = p_1 x^{p_2} (1-x)^{p_3} (1 + p_4 x)$$

at $Q_0^2 = 7$ GeV$^2$. The flavour structure of the light quark sea allows for the violation of the Gottfried sum
rule, and the strange sea is suppressed by a factor of 2 at $Q_0^2$, consistent with neutrino-induced dimuon
data from CCFR. The parameters $p_1$ to $p_4$ are constrained to impose the momentum sum-rule and the
number sum-rules on the valence distributions. There are 11 free parameters in the standard fit when
the strong coupling constant is fixed to $\alpha_s(M_Z^2) = 0.118$ [6], and 12 free parameters when $\alpha_s(M_Z^2)$ is
determined by the fit.
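
As an illustration of how a sum rule removes a free parameter, the sketch below fixes the normalisation $p_1$ of a valence-like distribution from the number sum rule, using the parametrisation above. The shape parameters, the function names and the target of two valence quarks are assumptions made for the example; they are not the fitted ZEUS values.

```python
import math

def beta(a, b):
    """Euler beta function B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def xf(x, p1, p2, p3, p4):
    """Momentum density x f(x) = p1 * x**p2 * (1 - x)**p3 * (1 + p4 * x)."""
    return p1 * x**p2 * (1.0 - x)**p3 * (1.0 + p4 * x)

def number_integral(p1, p2, p3, p4):
    """Analytic integral of f(x) = x f(x) / x over 0 < x < 1 (needs p2 > 0)."""
    return p1 * (beta(p2, p3 + 1.0) + p4 * beta(p2 + 1.0, p3 + 1.0))

def normalisation_from_number_sum_rule(n_quarks, p2, p3, p4):
    """Return the p1 for which the integral of f(x) equals n_quarks."""
    return n_quarks / number_integral(1.0, p2, p3, p4)

def number_integral_numeric(p1, p2, p3, p4, steps=20000):
    """Midpoint-rule cross-check; x = t**2 tames the behaviour near x = 0."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        x = t * t
        total += xf(x, p1, p2, p3, p4) / x * 2.0 * t * h
    return total

# Hypothetical shape parameters for a u-valence-like distribution.
P2, P3, P4 = 0.5, 3.0, 5.0
P1 = normalisation_from_number_sum_rule(2.0, P2, P3, P4)
```

With $p_1$ fixed in this way, only the shape parameters remain free for that distribution, which is how the constraints reduce the number of fitted parameters.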

3. Definition of $\chi^2$: treatment of correlated systematic uncertainties

The definition of $\chi^2$ used in global fits has traditionally been

$$\chi^2 = \sum_i \frac{\left[F_i^{\rm NLOQCD}(p) - F_i({\rm meas})\right]^2}{\sigma_{i,{\rm stat}}^2 + \sigma_{i,{\rm unc}}^2 + \sigma_{i,{\rm corr}}^2}$$

where $F_i^{\rm NLOQCD}(p)$ represents the prediction from NLO QCD in terms of the theoretical parameters $p$,
$F_i({\rm meas})$ represents a measured data point and the symbols $\sigma_{i,{\rm stat}}$, $\sigma_{i,{\rm unc}}$ and $\sigma_{i,{\rm corr}}$ represent its error
from statistical, uncorrelated and correlated systematic sources, respectively.

However, such a definition does not take into account the correlations of the correlated systematic
errors. Hence it has been modified to

$$\chi^2 = \sum_i \frac{\left[F_i(p,s) - F_i({\rm meas})\right]^2}{\sigma_{i,{\rm stat}}^2 + \sigma_{i,{\rm unc}}^2} + \sum_\lambda s_\lambda^2, \qquad F_i(p,s) = F_i^{\rm NLOQCD}(p) + \sum_\lambda s_\lambda \Delta_{i\lambda}^{\rm sys}. \qquad (1)$$

Eq. (1) shows how the theoretical prediction is modified to include the effect of the correlated
systematic uncertainties. The one-standard-deviation systematic uncertainty on data point $i$ due to source $\lambda$ is
referred to as $\Delta_{i\lambda}^{\rm sys}$, and the parameters $s_\lambda$ represent independent Gaussian random variables with zero
mean and unit variance for each source of systematic uncertainty. There are then several different ways
to proceed, as discussed below.



3.1 Offset methods 

The systematic uncertainty parameters $s_\lambda$ can be fixed to zero so that the fitted theoretical predictions are
as close as possible to the central values of the published data. However, the $s_\lambda$ are allowed to vary for
the error analysis, such that in addition to the usual Hessian matrix, $M_{jk}$, given by

$$M_{jk} = \frac{1}{2}\frac{\partial^2\chi^2}{\partial p_j\,\partial p_k},$$

which is evaluated with respect to the theoretical parameters, a second Hessian matrix, $C_{j\lambda}$, given by

$$C_{j\lambda} = \frac{1}{2}\frac{\partial^2\chi^2}{\partial p_j\,\partial s_\lambda},$$

is evaluated. The systematic covariance matrix is then given by $V^{ps} = M^{-1} C C^T M^{-1}$ [^] and the total
covariance matrix by $V^{tot} = V^p + V^{ps}$, where $V^p = M^{-1}$. Then the uncertainty on any distribution $F$
may be calculated from

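$$\langle \Delta F^2 \rangle = \sum_j \sum_k \frac{\partial F}{\partial p_j}\, V_{jk}\, \frac{\partial F}{\partial p_k}$$

by substituting $V^p$, $V^{ps}$ or $V^{tot}$ for $V$, to obtain the statistical (and uncorrelated systematic), correlated
systematic or total experimental error band, respectively.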
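
The propagation formula above can be sketched in a few lines. The two-parameter covariance matrices and the gradient of $F$ used here are invented numbers chosen only to make the example concrete; the function names are likewise hypothetical.

```python
import math

def propagate(grad, cov):
    """Return sqrt(sum_jk dF/dp_j * V_jk * dF/dp_k)."""
    n = len(grad)
    var = sum(grad[j] * cov[j][k] * grad[k]
              for j in range(n) for k in range(n))
    return math.sqrt(var)

def add_matrices(a, b):
    """Element-wise sum of two square matrices given as nested lists."""
    n = len(a)
    return [[a[j][k] + b[j][k] for k in range(n)] for j in range(n)]

# Hypothetical covariance matrices for two theoretical parameters.
V_P = [[4.0e-4, 1.0e-4], [1.0e-4, 9.0e-4]]      # statistical + uncorrelated
V_PS = [[16.0e-4, -2.0e-4], [-2.0e-4, 25.0e-4]]  # correlated systematic
V_TOT = add_matrices(V_P, V_PS)

# Hypothetical gradient (dF/dp_1, dF/dp_2) of some distribution F.
GRAD_F = [1.0, 2.0]

err_stat = propagate(GRAD_F, V_P)
err_sys = propagate(GRAD_F, V_PS)
err_tot = propagate(GRAD_F, V_TOT)
```

Because the total covariance matrix is the sum of the two pieces, the total error band is the quadrature sum of the other two.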

This method of accounting for systematic uncertainties will be called the 'offset method' since its
results are equivalent to those of the traditional offset method used by experimentalists, in which each $s_\lambda$
is varied by its assumed uncertainty ($\pm 1$), such that the data points are shifted to account for systematic
error $\lambda$, a new fit is performed for each of these variations, and the resulting deviations of the theoretical
parameters from their central values are added in quadrature (positive and negative deviations are
added in quadrature separately). This is not a statistically rigorous procedure, but its virtue is that it
does not assume that the systematic errors are necessarily Gaussian distributed. It gives a conservative
estimate of the error as compared to the Hessian methods [M 0|, which are described below.
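
The equivalence stated above can be illustrated with a toy straight-line fit, for which the prediction is linear in the parameters. The data, uncertainties and systematic shifts below are invented, and the helper names are hypothetical; the sketch only checks that refitting with each source shifted by $\pm 1$ gives the same parameter errors as the columns of $M^{-1}C$.

```python
def solve2(m, v):
    """Solve the 2x2 linear system m @ y = v by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(v[0] * m[1][1] - m[0][1] * v[1]) / det,
            (m[0][0] * v[1] - v[0] * m[1][0]) / det]

# Toy model F_i(p) = p_0 + p_1 * x_i; all numbers are hypothetical.
X = [0.1, 0.2, 0.4, 0.6, 0.8]
DATA = [1.12, 1.18, 1.43, 1.58, 1.83]
SIGMA = [0.05, 0.04, 0.04, 0.05, 0.06]
DELTA = [[0.02 * d for d in DATA],            # source 1: 2% normalisation
         [0.01, 0.01, 0.0, -0.01, -0.02]]     # source 2: x-dependent shift
BASIS = [[1.0] * len(X), X]                   # dF_i/dp_j

def hessian_m():
    """M_jk = 1/2 d2chi2/dp_j dp_k = sum_i a_ij a_ik / sigma_i^2."""
    return [[sum(BASIS[j][i] * BASIS[k][i] / SIGMA[i]**2
                 for i in range(len(X))) for k in range(2)] for j in range(2)]

def project(values):
    """sum_i a_ij values_i / sigma_i^2; with values = DELTA[l] this is C_jl."""
    return [sum(BASIS[j][i] * values[i] / SIGMA[i]**2
                for i in range(len(X))) for j in range(2)]

def fit(data):
    """Least-squares parameters with all systematic parameters fixed to zero."""
    return solve2(hessian_m(), project(data))

def offset_errors_by_refit():
    """Shift the data by +/- each source, refit, add deviations in quadrature."""
    central = fit(DATA)
    up, down = [0.0, 0.0], [0.0, 0.0]
    for delta in DELTA:
        for sign in (1.0, -1.0):
            p = fit([d + sign * e for d, e in zip(DATA, delta)])
            for j in range(2):
                dev = p[j] - central[j]
                if dev > 0.0:
                    up[j] += dev**2
                else:
                    down[j] += dev**2
    return [u**0.5 for u in up], [d**0.5 for d in down]

def offset_errors_by_matrix():
    """sqrt of the diagonal of V_ps = M^-1 C C^T M^-1."""
    m = hessian_m()
    var = [0.0, 0.0]
    for delta in DELTA:
        column = solve2(m, project(delta))   # one column of M^-1 C
        for j in range(2):
            var[j] += column[j]**2
    return [v**0.5 for v in var]
```

For a model that is not linear in its parameters the two procedures agree only to the extent that the linear approximation holds near the minimum.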



3.2 Hessian methods 

An alternative procedure would be to allow the systematic uncertainty parameters $s_\lambda$ to vary in the main
fit when determining the values of the theoretical parameters. This method is referred to as 'Hessian
method 1'. The errors on the theoretical parameters are then calculated from the inverse of a single
Hessian matrix which expresses the variation of $\chi^2$ with respect to both theoretical and systematic offset
parameters. Effectively, the theoretical prediction is not fitted to the central values of the published
experimental data, but allows these data points to move within the tolerance of their correlated systematic
uncertainties. It is necessary to check that points are not moved far outside their one-standard-deviation
systematic uncertainty estimates. The theoretical prediction determines the optimal settings for
correlated systematic shifts of experimental data points such that the most consistent fit to all data sets is
obtained. Thus systematic shifts in one experiment are correlated to those in another experiment by the
fit.

Hessian method 1 becomes an impractical procedure when the number of sources of systematic
uncertainty is large, as in the present global DIS analysis in which 71 independent sources of systematic
uncertainty were included. Recently CTEQ [10] have given an elegant analytic method for performing
the minimization with respect to systematic-uncertainty parameters. This gives a new formulation of the
$\chi^2$, such that the uncorrelated and systematic contributions to the $\chi^2$ can be evaluated separately. This
method is referred to as 'Hessian method 2'.
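
A minimal sketch of such an analytic minimisation is given below, assuming the $\chi^2$ of Eq. (1) with fixed theoretical parameters. Writing $r_i$ for prediction minus measurement, $B_\lambda = \sum_i \Delta_{i\lambda} r_i/\sigma_i^2$ and $A_{\lambda\mu} = \delta_{\lambda\mu} + \sum_i \Delta_{i\lambda}\Delta_{i\mu}/\sigma_i^2$, the minimum over the $s_\lambda$ is $\chi^2_{\rm unc} - B^T A^{-1} B$, reached at $s = -A^{-1}B$. This is the standard result for a quadratic form and is offered as an illustration, not as the exact expression used in the analysis; the residuals, errors and shifts are invented, and the code is restricted to two sources for brevity.

```python
def solve2(a, b):
    """Solve the 2x2 linear system a @ y = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - b[0] * a[1][0]) / det]

def chi2_explicit(resid, sigma, delta, s):
    """chi2 for residuals r_i = F_i(theory) - F_i(meas) and given shifts s."""
    n = len(resid)
    total = 0.0
    for i in range(n):
        shifted = resid[i] + sum(s[l] * delta[l][i] for l in range(len(s)))
        total += (shifted / sigma[i])**2
    return total + sum(v * v for v in s)

def chi2_profiled(resid, sigma, delta):
    """chi2 minimised analytically over two systematic parameters.

    Returns (chi2_min, s_best)."""
    n = len(resid)
    b = [sum(delta[l][i] * resid[i] / sigma[i]**2 for i in range(n))
         for l in range(2)]
    a = [[(1.0 if l == m else 0.0)
          + sum(delta[l][i] * delta[m][i] / sigma[i]**2 for i in range(n))
          for m in range(2)] for l in range(2)]
    a_inv_b = solve2(a, b)
    chi2_unc = sum((resid[i] / sigma[i])**2 for i in range(n))
    chi2_min = chi2_unc - sum(b[l] * a_inv_b[l] for l in range(2))
    return chi2_min, [-v for v in a_inv_b]

# Hypothetical residuals, uncorrelated errors and two systematic sources.
RESID = [0.05, -0.02, 0.04, 0.07, -0.01]
SIGMA = [0.05, 0.04, 0.04, 0.05, 0.06]
DELTA = [[0.02, 0.02, 0.03, 0.03, 0.04],
         [0.01, 0.01, 0.0, -0.01, -0.02]]

CHI2_MIN, S_BEST = chi2_profiled(RESID, SIGMA, DELTA)
```

The first term depends only on the uncorrelated errors and the second only on the systematic shifts, so the two contributions can be inspected separately, and only a matrix with the dimension of the number of systematic sources has to be inverted.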

The results for the ZEUS fit analysis are compared for these methods below. 

4. Fit results: experimental and model uncertainties 
4.1 Experimental uncertainties: offset method

The standard fit has been performed treating the experimental correlated systematic errors by the offset
method, with $\alpha_s(M_Z^2) = 0.118$ fixed. The fit gives an excellent description of the high-precision ZEUS
data and of the fixed-target data; see Ref. [2]. The sea and the gluon PDFs extracted from this fit are shown
in Fig. 1. These PDFs agree well with the latest distributions from MRST2001 [11] and CTEQ6 [12].



The error bands shown in the figure illustrate the experimental uncertainties from (i) statistical and
uncorrelated systematic uncertainties alone; (ii) total experimental uncertainty including correlated systematic
uncertainties; and (iii) additional uncertainty due to allowing $\alpha_s(M_Z^2)$ to be a parameter of the fit.

Clearly, in the latter case, the fit also determines the value of $\alpha_s(M_Z^2)$, with its correlations to the
PDF parameters fully accounted for. The value

$$\alpha_s(M_Z^2) = 0.1166 \pm 0.0008\,({\rm uncorr}) \pm 0.0032\,({\rm corr}) \pm 0.0036\,({\rm norm}) \qquad (2)$$

is obtained, where the three uncertainties arise from the following: statistical and other uncorrelated
sources; correlated systematic sources from all contributing experiments except that from their
normalisations; the contribution from the latter normalisations. The contribution from normalisation
uncertainties is shown separately from the other systematic uncertainties since, for many experiments, quoted
normalisation uncertainties represent the limits of a box-shaped distribution rather than the standard
deviation of a Gaussian distribution. An alternative evaluation of normalisation uncertainty, which accounts
for this, would reduce this contribution to $\pm 0.0010$.



4.2 Model uncertainties 

In addition to the experimental uncertainty on the fitted parameters, there is potentially a model
uncertainty due to the specific assumptions made when setting up the NLO QCD fit. Sources of model
uncertainty within the theoretical framework of leading-twist NLO QCD are: the effect of varying the
value of $Q_0^2$, and the minimum $Q^2$, $x$ and $W^2$ of data entering the fit; variation of the form of the
input PDF parameterisations; the choice of the heavy-quark production scheme. The sensitivity of the
results to the variation of these input assumptions has been quantified in terms of the resulting variation
in $\alpha_s(M_Z^2)$, since it is the most sensitive parameter. This leads to a model uncertainty in $\alpha_s(M_Z^2)$ of
$\Delta\alpha_s(M_Z^2) \simeq \pm 0.0018$, considerably smaller than the errors from correlated systematic and
normalisation uncertainties. The PDF parameters are much less sensitive to the model assumptions than $\alpha_s(M_Z^2)$.
It follows that the error bands illustrated on the parton densities in Fig. 1 represent reasonable estimates
of the total uncertainties within the theoretical framework of leading-twist NLO QCD.

Sources of uncertainty due to the theoretical framework itself are considered in the contribution of
R. Thorne to this meeting.

Fig. 1: Comparison of the gluon and sea distributions from the ZEUS-S NLO QCD fit for various $Q^2$ values. In this figure,
the cross-hatched error bands show the statistical and uncorrelated systematic uncertainty, the grey error bands show the total
experimental uncertainty including correlated systematic uncertainties (both evaluated from the standard fit with $\alpha_s(M_Z^2) =
0.118$) and the hatched error bands show the additional uncertainty coming from variation of the strong coupling constant.

Table 1: $\chi^2$ calculated by adding systematic and statistical errors in quadrature for the theoretical parameters
determined by the offset method and Hessian method 2.



4.3 Comparison of Offset and Hessian methods 

First, Hessian method 1 and Hessian method 2 have been compared in a fit to ZEUS data alone, for which
the systematic uncertainties and relative normalisations are well understood. The results are very similar,
as expected if the systematic uncertainties are Gaussian and the values $\Delta_{i\lambda}^{\rm sys}$ represent one-standard-deviation
uncertainties. However, if data sets from different experiments are used in the fit, these two
Hessian methods are only similar if normalisation uncertainties are not included among the systematic
uncertainties. Normalisation uncertainties are not always Gaussian and thus the analytic procedure of
Hessian method 2 is inappropriate for them.

The offset method has been compared to Hessian method 2 by performing the fit to global DIS
data using Hessian method 2 to calculate the $\chi^2$. Normalisation uncertainties were excluded and $\alpha_s(M_Z^2)$
was included as one of the theoretical parameters. This fit yields $\alpha_s(M_Z^2) = 0.1120 \pm 0.0013$, where the
error represents the total experimental uncertainty from correlated and uncorrelated sources, excluding
normalisation uncertainties. Thus this value should be compared with $\alpha_s(M_Z^2) = 0.1166 \pm 0.0033$,
evaluated using the offset method, also excluding normalisation uncertainties (see Eq. (2)). Hessian method
2 gives a much reduced 'optimal' error estimate for $\alpha_s(M_Z^2)$ and this is also the case for the PDF
parameters. The value of $\alpha_s(M_Z^2)$ is shifted from that obtained by the offset method. The PDF parameters
are not affected as strongly; their values are shifted by amounts which are well within the error estimates
quoted for the offset method.
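
The offset-method error quoted here is consistent with Eq. (2) if the uncorrelated and the correlated (non-normalisation) contributions are combined in quadrature. The quadrature combination is an assumption of this cross-check rather than a statement taken from the text:

```latex
\sqrt{0.0008^2 + 0.0032^2} = \sqrt{1.088\times 10^{-5}} \approx 0.0033
```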

To compare the $\chi^2$ of the fits done using the offset method and Hessian method 2, it is necessary to
use a common method of $\chi^2$ calculation. Table 1 presents the $\chi^2$ for the theoretical parameters obtained
using these methods, re-evaluated by adding statistical and systematic errors in quadrature. For both
methods $\alpha_s(M_Z^2)$ has been included among the theoretical parameters and normalisation uncertainties
have not been included among the systematic parameters. The total increase of $\chi^2$ for Hessian method 2
as compared to the offset method is $\Delta\chi^2 = 283$. The results of Hessian method 2 represent a fit with an
unacceptably large value of $\chi^2$ when judged in this conventional way.

4.4 Parameter evaluation and Hypothesis Testing 

To appreciate the significance of the difference in $\chi^2$ between various fits, the distinction between the $\chi^2$
changes appropriate for parameter estimation and for hypothesis testing should be considered.
Assuming that the experimental uncertainties which contribute have Gaussian distributions, errors on theoretical
parameters which are fitted within a fixed theoretical framework are derived from the criterion for
'parameter estimation', $\chi^2 \to \chi^2_{\rm min} + 1$. However, the goodness of fit of a theoretical hypothesis is judged on
the 'hypothesis testing' criterion, such that its $\chi^2$ should be approximately in the range $N \pm \sqrt{2N}$, where
$N$ is the number of degrees of freedom.



Fitting DIS data for PDF parameters and $\alpha_s(M_Z^2)$ is not a clean situation of either parameter
estimation or hypothesis testing. Within the theoretical framework of leading-twist NLO QCD, many
model inputs such as the form of the PDF parameterisations, the values of cuts, the value of $Q_0^2$, the
data sets used in the fit, etc., can be varied. These represent different hypotheses and they are accepted,
provided the fit $\chi^2$ falls within the hypothesis-testing criterion. The theoretical parameters obtained for
these different model hypotheses can differ from those obtained in the standard fit by more than their
errors as evaluated using the parameter-estimation criterion. In this case the model uncertainty on the
parameters exceeds the estimate of the total experimental error. This does not happen for the offset
method, in which the uncorrelated experimental errors evaluated by the parameter-estimation criterion
are augmented by the contribution of the correlated experimental systematic uncertainties, as explained
in Section 3. The shifts in theoretical parameter values for the different model hypotheses are found to
be well within the total conservative experimental error estimates. However, this is no longer the case
when the fit is performed using Hessian method 2. In this case the shifts in theoretical parameter values
for the different model hypotheses are outside the experimental error estimates. Since our purpose is to
estimate errors on the PDF parameters and $\alpha_s(M_Z^2)$ within a general theoretical framework, rather than
as specific to particular model choices within this framework, this is an issue which must be addressed.

The CTEQ collaboration [10, 12] have considered this problem. They consider that $\chi^2 \to \chi^2 + 1$
is not a reasonable tolerance on a global fit to $\sim 1200$ data points from diverse sources, with theoretical
and model uncertainties which are hard to quantify and experimental uncertainties which may not be
Gaussian distributed. They have tried to formulate criteria for a more reasonable setting of the tolerance
$T$, such that $\chi^2 \to \chi^2 + T^2$ becomes the variation on the basis of which errors on parameters are
calculated. In setting this tolerance they have considered that all of the current world data sets must be
acceptable and compatible at some level, even if strict statistical criteria are not met, since the conditions
for the application of strict criteria, namely Gaussian error distributions, are also not met. The level
of tolerance they suggest is $T \sim 10$. Note that this is similar to the hypothesis-testing tolerance $T =
\sqrt{\sqrt{2N}} \sim 7$ for the ZEUS fits. The errors for Hessian method 2 have been re-evaluated using the
tolerance $T = 7$ and, for $\alpha_s(M_Z^2)$, the result $\alpha_s(M_Z^2) = 0.1120 \pm 0.0033$ is obtained. The error is now
remarkably close to the error estimate of the offset method performed under the same conditions. This is
also the case for the errors on all of the PDF parameters.
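
As a rough check of the quoted tolerance, take $N \approx 1200$ degrees of freedom. This value is an assumption based on the approximate number of data points mentioned above; the exact number of degrees of freedom of the ZEUS fit is not stated here. Matching $T^2$ to the hypothesis-testing width $\sqrt{2N}$ then gives:

```latex
T^2 = \sqrt{2N} \approx \sqrt{2400} \approx 49, \qquad T = \sqrt{\sqrt{2N}} \approx 7.0
```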

Thus the offset method and the Hessian method with an augmented tolerance $T = \sqrt{\sqrt{2N}}$
give similar conservative error estimates. In choosing between these methods there are some additional
considerations.

In the Hessian method it is necessary to check that data points are not shifted far outside their
one-standard-deviation errors. When the ZEUS fits are done by Hessian method 2 some of the systematic
shifts for the 10 classes of systematic uncertainty of the ZEUS data move by $\sim \pm 1.4$ standard deviations.
There is no single kinematic region responsible for these shifts which could be excluded to reduce this
effect. Although these shifts are not very large, they differ significantly from the systematic shifts to
ZEUS data determined in the CTEQ fit. The choice of data sets included in the ZEUS fit also changes
the values of these systematic shifts, and making different model assumptions in the fits also produces
somewhat different systematic shifts. It seems unreasonable to let variations in the model, or the choice
of data included in the fit, change the best estimate of the central value of the data points.

In summary, the offset method has been selected for several reasons. Firstly, because its fit results
make theoretical predictions which are as close to the central values of the published data points as
possible. The selection of data sets included in the fit or superficial changes to the model are not allowed
to change the best estimate of the central value of the data points. Secondly, because its error estimates
are equivalent to those of a method which does not assume that experimental systematic uncertainties
are Gaussian distributed. Thirdly, because its results produce an acceptable $\chi^2$ when re-evaluated
conventionally by adding systematic and statistical errors in quadrature. Fourthly, because its conservative
error estimates take account of the fact that the purpose is to estimate errors on the PDF parameters
and $\alpha_s(M_Z^2)$ within a general theoretical framework not specific to particular model choices.
Quantitatively, the error estimates of the offset method correspond to those which would be obtained using the
more generous tolerance of the 'hypothesis testing' criterion in the more statistically rigorous Hessian
methods.

5. CONCLUSION 

The NLO DGLAP QCD formalism has been used to fit ZEUS data and fixed-target data in the kinematic
region $Q^2 > 2.5$ GeV$^2$, $6.3 \times 10^{-5} < x < 0.65$ and $W^2 > 20$ GeV$^2$. The parton distribution functions
for the u and d valence quarks, the gluon and the total sea have been determined and the results are
compatible with those of MRST2001 and CTEQ6. The ZEUS data are crucial in determining the gluon
and the sea distributions.

Full account has been taken of correlated experimental systematic uncertainties for all
experiments. The resulting experimental uncertainties on the parton distribution functions have been evaluated
conservatively, such that the model uncertainty, within the framework of leading-twist
next-to-leading-order QCD, is negligible by comparison. Hence the error bands on the PDFs resulting from these fits
represent reasonable estimates of the total uncertainties within this theoretical framework.